Evaluating the Pronunciation Component of Text-to-Speech Systems for English: A Performance Comparison
Abstract
The automatic derivation of word pronunciations from input text is a central task for any text-to-speech system. For general English text at least, this is often thought to be a solved problem, with manually-derived linguistic rules assumed capable of handling ‘novel’ words missing from the system dictionary. Data-driven methods, based on machine learning of the regularities implicit in a large pronouncing dictionary, have received considerable attention recently but are generally thought to perform less well. However, these tentative beliefs are at best uncertain without powerful methods for comparing text-to-phoneme subsystems. This paper contributes to the development of such methods by comparing the performance of four representative approaches to automatic phonemisation on the same test dictionary. As well as rule-based approaches, three data-driven techniques are evaluated: pronunciation by analogy (PbA), NETspeak and IB1-IG (a modified k-nearest neighbour method). Issues involved in comparative evaluation are detailed and elucidated. The data-driven techniques outperform rules in accuracy of letter-to-phoneme translation by a very significant margin but require aligned text-phoneme training data and are slower. Best translation results are obtained with PbA at approximately 72% words correct on a reasonably large pronouncing dictionary, compared to something like 26% words correct for the rules, indicating that automatic pronunciation of text is not a solved problem.
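The "words correct" figure quoted above can be illustrated with a minimal sketch. It assumes a word counts as correct only when its whole predicted phoneme string matches the dictionary entry exactly. The function names, the toy dictionary, the phoneme symbols and the letter-to-sound table are all hypothetical and are not taken from the paper.

```python
def words_correct(predict, test_dictionary):
    """Percentage of words whose predicted pronunciation matches the reference exactly."""
    if not test_dictionary:
        raise ValueError("test dictionary is empty")
    hits = sum(1 for word, ref in test_dictionary.items() if predict(word) == ref)
    return 100.0 * hits / len(test_dictionary)


# Hypothetical four-word test dictionary: spelling -> reference phoneme tuple.
reference = {
    "cat": ("k", "a", "t"),
    "city": ("s", "i", "t", "i"),
    "cell": ("s", "e", "l"),
    "kit": ("k", "i", "t"),
}

# Hypothetical context-free letter-to-sound table (one phoneme per letter).
LETTER_TO_PHONEME = {
    "c": "k", "a": "a", "t": "t", "i": "i",
    "y": "i", "e": "e", "l": "l", "k": "k",
}


def naive_rules(word):
    """Toy phonemiser: maps each letter independently, ignoring context."""
    return tuple(LETTER_TO_PHONEME[ch] for ch in word)


score = words_correct(naive_rules, reference)
print(f"{score:.1f}% words correct")
```

Because a single wrong phoneme makes the whole word wrong, this measure is much stricter than per-phoneme accuracy. Here the toy mapping fails on "city" and "cell", where "c" is soft and "ll" is a single sound.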
Similar resources
Computer Assisted Pronunciation Teaching (CAPT) and Pedagogy: Improving EFL learners’ Pronunciation Using Clear Pronunciation 2 Software
This study examined the impact of Clear Pronunciation 2 software on teaching English suprasegmental features, focusing on stress, rhythm and intonation. In particular, the software covers five topics in relation to suprasegmental features including consonant cluster, word stress, connected speech, sentence stress and intonation. Seven Iranian EFL learners participated in this study. The study l...
Advantages of Using Computer in Teaching English Pronunciation
Pronunciation continues to grow in importance because of its key roles in speech recognition, speech perception, and speaker identity. Computer is being increasingly used in teaching English pronunciation to enhance its quality. The purpose of this paper is to discuss the advantages of using computer in English pronunciation instruction. Understanding the advantages of computer is an important ...
Evaluating the pronunciation component of text-to-speech systems for English: a performance comparison of different approaches
The automatic derivation of word pronunciations from input text is a central task for any text-to-speech system. For general English text at least, this is often thought to be a solved problem, with manually-derived linguistic rules assumed capable of handling “novel” words missing from the system dictionary. Data-driven methods, based on machine learning of the regularities implicit in a large...
Prosodic elements to improve pronunciation in English language learners: A short report
The usefulness of teaching pronunciation in language instruction remains controversial. Though past research suggests that teachers can make little or no difference in improving their students’ pronunciation, current findings suggest that second language pronunciation can improve to be near native-like with the implementation of certain criteria such as the utilization of...
Pronunciation Variants Across Systems, Languages and Speaking Style
This contribution aims at evaluating the use of pronunciation variants across different system configurations, languages and speaking styles. This study is limited to the use of variants during speech alignment, given an orthographic transcription and a phonemically represented lexicon, thus focusing on the modeling abilities of the acoustic word models. Parallel and sequential variants are tes...